Overview

Dataset statistics

Number of variables: 32
Number of observations: 761
Missing cells: 173
Missing cells (%): 0.7%
Duplicate rows: 19
Duplicate rows (%): 2.5%
Total size in memory: 327.7 KiB
Average record size in memory: 441.0 B
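
The percentages and the average record size above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Raw counts copied from the dataset statistics above.
n_variables = 32
n_observations = 761
missing_cells = 173
duplicate_rows = 19
total_size_kib = 327.7

# Assumption: percentages are relative to all cells and all rows respectively.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values round to the figures listed in the report.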

Variable types

Numeric: 8
Categorical: 24

Alerts

Dataset has 19 (2.5%) duplicate rows | Duplicates
Age is highly overall correlated with Num of pregnancies | High correlation
Biopsy is highly overall correlated with Hinselmann and 1 other fields | High correlation
Dx is highly overall correlated with Dx:Cancer and 1 other fields | High correlation
Dx:Cancer is highly overall correlated with Dx and 1 other fields | High correlation
Dx:HPV is highly overall correlated with Dx and 1 other fields | High correlation
Hinselmann is highly overall correlated with Biopsy and 1 other fields | High correlation
Hormonal Contraceptives is highly overall correlated with Hormonal Contraceptives (years) | High correlation
Hormonal Contraceptives (years) is highly overall correlated with Hormonal Contraceptives | High correlation
IUD is highly overall correlated with IUD (years) | High correlation
IUD (years) is highly overall correlated with IUD | High correlation
Num of pregnancies is highly overall correlated with Age | High correlation
STDs is highly overall correlated with STDs (number) and 3 other fields | High correlation
STDs (number) is highly overall correlated with STDs and 6 other fields | High correlation
STDs: Number of diagnosis is highly overall correlated with STDs and 4 other fields | High correlation
STDs:HIV is highly overall correlated with STDs (number) and 1 other fields | High correlation
STDs:condylomatosis is highly overall correlated with STDs and 3 other fields | High correlation
STDs:syphilis is highly overall correlated with STDs (number) | High correlation
STDs:vaginal condylomatosis is highly overall correlated with STDs (number) | High correlation
STDs:vulvo-perineal condylomatosis is highly overall correlated with STDs and 3 other fields | High correlation
Schiller is highly overall correlated with Biopsy and 1 other fields | High correlation
Smokes is highly overall correlated with Smokes (packs/year) and 1 other fields | High correlation
Smokes (packs/year) is highly overall correlated with Smokes and 1 other fields | High correlation
Smokes (years) is highly overall correlated with Smokes and 1 other fields | High correlation
IUD is highly imbalanced (53.6%) | Imbalance
STDs is highly imbalanced (55.3%) | Imbalance
STDs (number) is highly imbalanced (75.0%) | Imbalance
STDs:condylomatosis is highly imbalanced (70.8%) | Imbalance
STDs:vaginal condylomatosis is highly imbalanced (95.3%) | Imbalance
STDs:vulvo-perineal condylomatosis is highly imbalanced (71.4%) | Imbalance
STDs:syphilis is highly imbalanced (84.6%) | Imbalance
STDs:pelvic inflammatory disease is highly imbalanced (98.6%) | Imbalance
STDs:genital herpes is highly imbalanced (98.6%) | Imbalance
STDs:molluscum contagiosum is highly imbalanced (98.6%) | Imbalance
STDs:HIV is highly imbalanced (86.0%) | Imbalance
STDs:Hepatitis B is highly imbalanced (98.6%) | Imbalance
STDs:HPV is highly imbalanced (97.4%) | Imbalance
STDs: Number of diagnosis is highly imbalanced (78.2%) | Imbalance
Dx:Cancer is highly imbalanced (84.6%) | Imbalance
Dx:CIN is highly imbalanced (95.3%) | Imbalance
Dx:HPV is highly imbalanced (84.6%) | Imbalance
Dx is highly imbalanced (83.9%) | Imbalance
Hinselmann is highly imbalanced (76.0%) | Imbalance
Schiller is highly imbalanced (58.4%) | Imbalance
Citology is highly imbalanced (69.2%) | Imbalance
Biopsy is highly imbalanced (67.1%) | Imbalance
Hormonal Contraceptives (years) has 85 (11.2%) missing values | Missing
IUD (years) has 88 (11.6%) missing values | Missing
Num of pregnancies has 14 (1.8%) zeros | Zeros
Smokes (years) has 650 (85.4%) zeros | Zeros
Smokes (packs/year) has 650 (85.4%) zeros | Zeros
Hormonal Contraceptives (years) has 240 (31.5%) zeros | Zeros
IUD (years) has 598 (78.6%) zeros | Zeros
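
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the class distribution. The sketch below is an assumption checked against three of the reported figures, not a copy of the profiler's code; `imbalance_score` is a hypothetical helper name, and the class counts come from the per-variable sections of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class distribution and k is the number of classes (k >= 2)."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(len(counts))

# Class counts copied from the per-variable sections of this report.
iud = imbalance_score([686, 75])
stds = imbalance_score([690, 71])
stds_number = imbalance_score([690, 33, 31, 6, 1])

print(f"IUD: {iud:.1%}, STDs: {stds:.1%}, STDs (number): {stds_number:.1%}")
```

A score of 0% corresponds to perfectly balanced classes and 100% to a single class holding every row, which is why the rare STD flags score close to 100%.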

Reproduction

Analysis started: 2024-07-26 14:16:07.265827
Analysis finished: 2024-07-26 14:16:12.275987
Duration: 5.01 seconds
Software version: ydata-profiling v4.7.0
Download configuration: config.json

Variables

Age
Real number (ℝ)

HIGH CORRELATION 

Distinct: 43
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.863338
Minimum: 13
Maximum: 84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.9 KiB

Quantile statistics

Minimum: 13
5th percentile: 16
Q1: 20
Median: 25
Q3: 32
95th percentile: 41
Maximum: 84
Range: 71
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 8.5006027
Coefficient of variation (CV): 0.31643881
Kurtosis: 5.2562488
Mean: 26.863338
Median Absolute Deviation (MAD): 5
Skewness: 1.4624457
Sum: 20443
Variance: 72.260246
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=43)
Value | Count | Frequency (%)
23 | 52 | 6.8%
18 | 42 | 5.5%
20 | 41 | 5.4%
21 | 41 | 5.4%
19 | 40 | 5.3%
24 | 36 | 4.7%
28 | 34 | 4.5%
25 | 33 | 4.3%
26 | 32 | 4.2%
30 | 32 | 4.2%
Other values (33) | 378 | 49.7%

Value | Count | Frequency (%)
13 | 1 | 0.1%
14 | 2 | 0.3%
15 | 20 | 2.6%
16 | 19 | 2.5%
17 | 30 | 3.9%
18 | 42 | 5.5%
19 | 40 | 5.3%
20 | 41 | 5.4%
21 | 41 | 5.4%
22 | 27 | 3.5%

Value | Count | Frequency (%)
84 | 1 | 0.1%
79 | 1 | 0.1%
70 | 2 | 0.3%
52 | 2 | 0.3%
51 | 1 | 0.1%
50 | 1 | 0.1%
49 | 2 | 0.3%
48 | 2 | 0.3%
47 | 1 | 0.1%
46 | 3 | 0.4%

Number of sexual partners
Real number (ℝ)

Distinct: 10
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5282523
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 2
Q3: 3
95th percentile: 5
Maximum: 28
Range: 27
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.6277376
Coefficient of variation (CV): 0.64381929
Kurtosis: 79.092546
Mean: 2.5282523
Median Absolute Deviation (MAD): 1
Skewness: 5.7345672
Sum: 1924
Variance: 2.6495297
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)
Value | Count | Frequency (%)
2 | 247 | 32.5%
3 | 195 | 25.6%
1 | 185 | 24.3%
4 | 73 | 9.6%
5 | 39 | 5.1%
6 | 9 | 1.2%
7 | 7 | 0.9%
8 | 4 | 0.5%
10 | 1 | 0.1%
28 | 1 | 0.1%

Value | Count | Frequency (%)
1 | 185 | 24.3%
2 | 247 | 32.5%
3 | 195 | 25.6%
4 | 73 | 9.6%
5 | 39 | 5.1%
6 | 9 | 1.2%
7 | 7 | 0.9%
8 | 4 | 0.5%
10 | 1 | 0.1%
28 | 1 | 0.1%

Value | Count | Frequency (%)
28 | 1 | 0.1%
10 | 1 | 0.1%
8 | 4 | 0.5%
7 | 7 | 0.9%
6 | 9 | 1.2%
5 | 39 | 5.1%
4 | 73 | 9.6%
3 | 195 | 25.6%
2 | 247 | 32.5%
1 | 185 | 24.3%
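
Because this variable has only 10 distinct values, the frequency table above lists the whole distribution, so the summary statistics can be recomputed from it. A minimal sketch using Python's standard `statistics` module, assuming the reported variance is the sample variance (n - 1 denominator); the dictionary name `partners` is illustrative:

```python
import statistics

# Value -> count, copied from the frequency table above.
partners = {1: 185, 2: 247, 3: 195, 4: 73, 5: 39, 6: 9, 7: 7, 8: 4, 10: 1, 28: 1}

# Expand the table back into one value per observation.
values = [value for value, count in partners.items() for _ in range(count)]

n = len(values)
total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean  # coefficient of variation

print(n, total, median)
print(f"mean={mean:.7f} variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
```

The recomputed count, sum, median, variance and coefficient of variation agree with the values in the statistics block, which also shows how strongly the single observation of 28 drives the variance.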

First sexual intercourse
Real number (ℝ)

Distinct: 21
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.038108
Minimum: 10
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.9 KiB

Quantile statistics

Minimum: 10
5th percentile: 14
Q1: 15
Median: 17
Q3: 18
95th percentile: 22
Maximum: 32
Range: 22
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.8330509
Coefficient of variation (CV): 0.16627732
Kurtosis: 4.3464585
Mean: 17.038108
Median Absolute Deviation (MAD): 2
Skewness: 1.6157906
Sum: 12966
Variance: 8.0261775
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=21)
Value | Count | Frequency (%)
15 | 153 | 20.1%
17 | 133 | 17.5%
18 | 122 | 16.0%
16 | 103 | 13.5%
14 | 72 | 9.5%
19 | 56 | 7.4%
20 | 34 | 4.5%
13 | 19 | 2.5%
21 | 18 | 2.4%
23 | 8 | 1.1%
Other values (11) | 43 | 5.7%

Value | Count | Frequency (%)
10 | 2 | 0.3%
11 | 1 | 0.1%
12 | 4 | 0.5%
13 | 19 | 2.5%
14 | 72 | 9.5%
15 | 153 | 20.1%
16 | 103 | 13.5%
17 | 133 | 17.5%
18 | 122 | 16.0%
19 | 56 | 7.4%

Value | Count | Frequency (%)
32 | 1 | 0.1%
29 | 5 | 0.7%
28 | 3 | 0.4%
27 | 5 | 0.7%
26 | 7 | 0.9%
25 | 2 | 0.3%
24 | 6 | 0.8%
23 | 8 | 1.1%
22 | 7 | 0.9%
21 | 18 | 2.4%

Num of pregnancies
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 11
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2838371
Minimum: 0
Maximum: 11
Zeros: 14
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 11.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 1
Median: 2
Q3: 3
95th percentile: 5
Maximum: 11
Range: 11
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4566956
Coefficient of variation (CV): 0.63782817
Kurtosis: 3.270227
Mean: 2.2838371
Median Absolute Deviation (MAD): 1
Skewness: 1.4423115
Sum: 1738
Variance: 2.1219621
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
1 | 258 | 33.9%
2 | 225 | 29.6%
3 | 133 | 17.5%
4 | 71 | 9.3%
5 | 32 | 4.2%
6 | 18 | 2.4%
0 | 14 | 1.8%
7 | 6 | 0.8%
8 | 2 | 0.3%
11 | 1 | 0.1%

Value | Count | Frequency (%)
0 | 14 | 1.8%
1 | 258 | 33.9%
2 | 225 | 29.6%
3 | 133 | 17.5%
4 | 71 | 9.3%
5 | 32 | 4.2%
6 | 18 | 2.4%
7 | 6 | 0.8%
8 | 2 | 0.3%
10 | 1 | 0.1%

Value | Count | Frequency (%)
11 | 1 | 0.1%
10 | 1 | 0.1%
8 | 2 | 0.3%
7 | 6 | 0.8%
6 | 18 | 2.4%
5 | 32 | 4.2%
4 | 71 | 9.3%
3 | 133 | 17.5%
2 | 225 | 29.6%
1 | 258 | 33.9%

Smokes
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 650
1.0: 111

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 1.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 650
85.4%
1.0 111
 
14.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 650
85.4%
1.0 111
 
14.6%

Most occurring characters

ValueCountFrequency (%)
0 1411
61.8%
. 761
33.3%
1 111
 
4.9%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1411
61.8%
. 761
33.3%
1 111
 
4.9%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1411
61.8%
. 761
33.3%
1 111
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1411
61.8%
. 761
33.3%
1 111
 
4.9%
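
The character tables above are a side effect of the 0/1 flag being stored as the three-character strings "0.0" and "1.0", so the profiler analyses it as text. A small sketch showing how the character counts follow directly from the two class counts; the names `value_counts` and `char_counts` are illustrative:

```python
from collections import Counter

# Category labels and their row counts for "Smokes", copied from the report.
value_counts = {"0.0": 650, "1.0": 111}

# Every row contributes each character of its label once.
char_counts = Counter()
for label, count in value_counts.items():
    for ch in label:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(f"{ch!r}: {count} ({count / total_chars:.1%})")
```

The same reasoning applies to the other binary flags in this report: only the split between the two labels changes, while the total stays at 761 rows × 3 characters.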

Smokes (years)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 30
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2232111
Minimum: 0
Maximum: 37
Zeros: 650
Zeros (%): 85.4%
Negative: 0
Negative (%): 0.0%
Memory size: 11.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 9
Maximum: 37
Range: 37
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.1057398
Coefficient of variation (CV): 3.3565258
Kurtosis: 24.658441
Mean: 1.2232111
Median Absolute Deviation (MAD): 0
Skewness: 4.5306848
Sum: 930.86367
Variance: 16.857099
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
0 | 650 | 85.4%
1.266972909 | 12 | 1.6%
5 | 8 | 1.1%
9 | 8 | 1.1%
1 | 7 | 0.9%
3 | 7 | 0.9%
2 | 7 | 0.9%
7 | 6 | 0.8%
16 | 6 | 0.8%
8 | 6 | 0.8%
Other values (20) | 44 | 5.8%

Value | Count | Frequency (%)
0 | 650 | 85.4%
0.16 | 1 | 0.1%
0.5 | 3 | 0.4%
1 | 7 | 0.9%
1.266972909 | 12 | 1.6%
2 | 7 | 0.9%
3 | 7 | 0.9%
4 | 4 | 0.5%
5 | 8 | 1.1%
6 | 4 | 0.5%

Value | Count | Frequency (%)
37 | 1 | 0.1%
34 | 1 | 0.1%
32 | 1 | 0.1%
28 | 1 | 0.1%
24 | 1 | 0.1%
22 | 1 | 0.1%
21 | 1 | 0.1%
20 | 1 | 0.1%
19 | 2 | 0.3%
18 | 1 | 0.1%

Smokes (packs/year)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 56
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.44915038
Minimum: 0
Maximum: 37
Zeros: 650
Zeros (%): 85.4%
Negative: 0
Negative (%): 0.0%
Memory size: 11.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 2.4
Maximum: 37
Range: 37
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.2542472
Coefficient of variation (CV): 5.0189141
Kurtosis: 119.25579
Mean: 0.44915038
Median Absolute Deviation (MAD): 0
Skewness: 9.554199
Sum: 341.80344
Variance: 5.0816303
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 650 | 85.4%
0.5132021277 | 17 | 2.2%
3 | 5 | 0.7%
2 | 4 | 0.5%
0.75 | 4 | 0.5%
1.2 | 4 | 0.5%
0.2 | 4 | 0.5%
0.05 | 4 | 0.5%
1 | 4 | 0.5%
0.1 | 3 | 0.4%
Other values (46) | 62 | 8.1%

Value | Count | Frequency (%)
0 | 650 | 85.4%
0.001 | 1 | 0.1%
0.003 | 1 | 0.1%
0.025 | 1 | 0.1%
0.04 | 1 | 0.1%
0.05 | 4 | 0.5%
0.1 | 3 | 0.4%
0.15 | 1 | 0.1%
0.16 | 2 | 0.3%
0.2 | 4 | 0.5%

Value | Count | Frequency (%)
37 | 1 | 0.1%
22 | 1 | 0.1%
21 | 1 | 0.1%
19 | 1 | 0.1%
12 | 3 | 0.4%
9 | 2 | 0.3%
8 | 2 | 0.3%
7.5 | 1 | 0.1%
7 | 2 | 0.3%
6 | 3 | 0.4%

Hormonal Contraceptives
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
1.0: 436
0.0: 325

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 1.0
4th row: 1.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
1.0 436
57.3%
0.0 325
42.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1.0 436
57.3%
0.0 325
42.7%

Most occurring characters

ValueCountFrequency (%)
0 1086
47.6%
. 761
33.3%
1 436
19.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1086
47.6%
. 761
33.3%
1 436
19.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1086
47.6%
. 761
33.3%
1 436
19.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1086
47.6%
. 761
33.3%
1 436
19.1%

Hormonal Contraceptives (years)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 39
Distinct (%): 5.8%
Missing: 85
Missing (%): 11.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3317225
Minimum: 0
Maximum: 30
Zeros: 240
Zeros (%): 31.5%
Negative: 0
Negative (%): 0.0%
Memory size: 11.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0.5
Q3: 3
95th percentile: 10
Maximum: 30
Range: 30
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.8613217
Coefficient of variation (CV): 1.6559954
Kurtosis: 8.7035047
Mean: 2.3317225
Median Absolute Deviation (MAD): 0.5
Skewness: 2.5878372
Sum: 1576.2444
Variance: 14.909805
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=39)
Value | Count | Frequency (%)
0 | 240 | 31.5%
1 | 73 | 9.6%
2 | 36 | 4.7%
0.25 | 34 | 4.5%
5 | 32 | 4.2%
3 | 31 | 4.1%
0.5 | 23 | 3.0%
0.08 | 23 | 3.0%
6 | 22 | 2.9%
4 | 22 | 2.9%
Other values (29) | 140 | 18.4%
(Missing) | 85 | 11.2%

Value | Count | Frequency (%)
0 | 240 | 31.5%
0.08 | 23 | 3.0%
0.16 | 16 | 2.1%
0.17 | 1 | 0.1%
0.25 | 34 | 4.5%
0.33 | 8 | 1.1%
0.41 | 1 | 0.1%
0.42 | 6 | 0.8%
0.5 | 23 | 3.0%
0.58 | 5 | 0.7%

Value | Count | Frequency (%)
30 | 1 | 0.1%
22 | 1 | 0.1%
20 | 4 | 0.5%
19 | 2 | 0.3%
17 | 1 | 0.1%
16 | 2 | 0.3%
15 | 6 | 0.8%
14 | 1 | 0.1%
13 | 2 | 0.3%
12 | 4 | 0.5%

IUD
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 686
1.0: 75

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 686
90.1%
1.0 75
 
9.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 686
90.1%
1.0 75
 
9.9%

Most occurring characters

ValueCountFrequency (%)
0 1447
63.4%
. 761
33.3%
1 75
 
3.3%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1447
63.4%
. 761
33.3%
1 75
 
3.3%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1447
63.4%
. 761
33.3%
1 75
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1447
63.4%
. 761
33.3%
1 75
 
3.3%

IUD (years)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 25
Distinct (%): 3.7%
Missing: 88
Missing (%): 11.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.52609212
Minimum: 0
Maximum: 19
Zeros: 598
Zeros (%): 78.6%
Negative: 0
Negative (%): 0.0%
Memory size: 11.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 4
Maximum: 19
Range: 19
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.9943689
Coefficient of variation (CV): 3.790912
Kurtosis: 29.398766
Mean: 0.52609212
Median Absolute Deviation (MAD): 0
Skewness: 4.9824462
Sum: 354.06
Variance: 3.9775075
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=25)
Value | Count | Frequency (%)
0 | 598 | 78.6%
3 | 11 | 1.4%
2 | 8 | 1.1%
5 | 7 | 0.9%
7 | 7 | 0.9%
1 | 7 | 0.9%
8 | 7 | 0.9%
4 | 5 | 0.7%
6 | 3 | 0.4%
11 | 3 | 0.4%
Other values (15) | 17 | 2.2%
(Missing) | 88 | 11.6%

Value | Count | Frequency (%)
0 | 598 | 78.6%
0.08 | 2 | 0.3%
0.16 | 1 | 0.1%
0.17 | 1 | 0.1%
0.25 | 1 | 0.1%
0.33 | 1 | 0.1%
0.5 | 2 | 0.3%
0.58 | 1 | 0.1%
0.91 | 1 | 0.1%
1 | 7 | 0.9%

Value | Count | Frequency (%)
19 | 1 | 0.1%
17 | 1 | 0.1%
15 | 1 | 0.1%
12 | 1 | 0.1%
11 | 3 | 0.4%
10 | 1 | 0.1%
9 | 1 | 0.1%
8 | 7 | 0.9%
7 | 7 | 0.9%
6 | 3 | 0.4%

STDs
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 690
1.0: 71

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 690
90.7%
1.0 71
 
9.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 690
90.7%
1.0 71
 
9.3%

Most occurring characters

ValueCountFrequency (%)
0 1451
63.6%
. 761
33.3%
1 71
 
3.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1451
63.6%
. 761
33.3%
1 71
 
3.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1451
63.6%
. 761
33.3%
1 71
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1451
63.6%
. 761
33.3%
1 71
 
3.1%

STDs (number)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 690
2.0: 33
1.0: 31
3.0: 6
4.0: 1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 690
90.7%
2.0 33
 
4.3%
1.0 31
 
4.1%
3.0 6
 
0.8%
4.0 1
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 690
90.7%
2.0 33
 
4.3%
1.0 31
 
4.1%
3.0 6
 
0.8%
4.0 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 1451
63.6%
. 761
33.3%
2 33
 
1.4%
1 31
 
1.4%
3 6
 
0.3%
4 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1451
63.6%
. 761
33.3%
2 33
 
1.4%
1 31
 
1.4%
3 6
 
0.3%
4 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1451
63.6%
. 761
33.3%
2 33
 
1.4%
1 31
 
1.4%
3 6
 
0.3%
4 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1451
63.6%
. 761
33.3%
2 33
 
1.4%
1 31
 
1.4%
3 6
 
0.3%
4 1
 
< 0.1%

STDs:condylomatosis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 722
1.0: 39

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 722
94.9%
1.0 39
 
5.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 722
94.9%
1.0 39
 
5.1%

Most occurring characters

ValueCountFrequency (%)
0 1483
65.0%
. 761
33.3%
1 39
 
1.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1483
65.0%
. 761
33.3%
1 39
 
1.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1483
65.0%
. 761
33.3%
1 39
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1483
65.0%
. 761
33.3%
1 39
 
1.7%

STDs:vaginal condylomatosis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 757
1.0: 4

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 757
99.5%
1.0 4
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 757
99.5%
1.0 4
 
0.5%

Most occurring characters

ValueCountFrequency (%)
0 1518
66.5%
. 761
33.3%
1 4
 
0.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1518
66.5%
. 761
33.3%
1 4
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1518
66.5%
. 761
33.3%
1 4
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1518
66.5%
. 761
33.3%
1 4
 
0.2%

STDs:vulvo-perineal condylomatosis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 723
1.0: 38

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 723
95.0%
1.0 38
 
5.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 723
95.0%
1.0 38
 
5.0%

Most occurring characters

ValueCountFrequency (%)
0 1484
65.0%
. 761
33.3%
1 38
 
1.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1484
65.0%
. 761
33.3%
1 38
 
1.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1484
65.0%
. 761
33.3%
1 38
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1484
65.0%
. 761
33.3%
1 38
 
1.7%

STDs:syphilis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 744
1.0: 17

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 744
97.8%
1.0 17
 
2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 744
97.8%
1.0 17
 
2.2%

Most occurring characters

ValueCountFrequency (%)
0 1505
65.9%
. 761
33.3%
1 17
 
0.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1505
65.9%
. 761
33.3%
1 17
 
0.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1505
65.9%
. 761
33.3%
1 17
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1505
65.9%
. 761
33.3%
1 17
 
0.7%

STDs:pelvic inflammatory disease
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 760
1.0: 1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 760
99.9%
1.0 1
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 760
99.9%
1.0 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

STDs:genital herpes
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 760
1.0: 1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 760
99.9%
1.0 1
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 760
99.9%
1.0 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

STDs:molluscum contagiosum
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 44.6 KiB
0.0: 760
1.0: 1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2283
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

ValueCountFrequency (%)
0.0 760
99.9%
1.0 1
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 760
99.9%
1.0 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

STDs:HIV
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size44.6 KiB
0.0
746 
1.0
 
15

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2283
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 746
98.0%
1.0 15
 
2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 746
98.0%
1.0 15
 
2.0%

Most occurring characters

ValueCountFrequency (%)
0 1507
66.0%
. 761
33.3%
1 15
 
0.7%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1507
66.0%
. 761
33.3%
1 15
 
0.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1507
66.0%
. 761
33.3%
1 15
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1507
66.0%
. 761
33.3%
1 15
 
0.7%

STDs:Hepatitis B
Categorical

IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size44.6 KiB
0.0
760 
1.0
 
1

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2283
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)0.1%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 760
99.9%
1.0 1
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 760
99.9%
1.0 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1521
66.6%
. 761
33.3%
1 1
 
< 0.1%

STDs:HPV
Categorical

IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size44.6 KiB
0.0
759 
1.0
 
2

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2283
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 759
99.7%
1.0 2
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 759
99.7%
1.0 2
 
0.3%

Most occurring characters

ValueCountFrequency (%)
0 1520
66.6%
. 761
33.3%
1 2
 
0.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1520
66.6%
. 761
33.3%
1 2
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1520
66.6%
. 761
33.3%
1 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1520
66.6%
. 761
33.3%
1 2
 
0.1%

STDs: Number of diagnosis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct4
Distinct (%)0.5%
Missing0
Missing (%)0.0%
Memory size44.6 KiB
0.0
697 
1.0
 
62
3.0
 
1
2.0
 
1

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters2283
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)0.3%
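
Distinct counts the different values in a column, while Unique counts the values that occur exactly once. The sketch below recomputes both from the counts listed for this variable. Rebuilding the column as a list of strings is an assumption made for the example.

```python
from collections import Counter

# Column rebuilt from the counts listed for "STDs: Number of diagnosis".
values = ["0.0"] * 697 + ["1.0"] * 62 + ["3.0"] + ["2.0"]

counts = Counter(values)

# Distinct: how many different values appear at all.
distinct = len(counts)

# Unique: how many values appear exactly once.
unique = sum(1 for c in counts.values() if c == 1)

print(distinct, unique, f"{unique / len(values):.1%}")
```

Here 3.0 and 2.0 each occur once, which gives 2 unique values out of 761 rows, or about 0.3%.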

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 697
91.6%
1.0 62
 
8.1%
3.0 1
 
0.1%
2.0 1
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 697
91.6%
1.0 62
 
8.1%
3.0 1
 
0.1%
2.0 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 1458
63.9%
. 761
33.3%
1 62
 
2.7%
3 1
 
< 0.1%
2 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 1458
63.9%
. 761
33.3%
1 62
 
2.7%
3 1
 
< 0.1%
2 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 1458
63.9%
. 761
33.3%
1 62
 
2.7%
3 1
 
< 0.1%
2 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 2283
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 1458
63.9%
. 761
33.3%
1 62
 
2.7%
3 1
 
< 0.1%
2 1
 
< 0.1%

Dx:Cancer
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size43.1 KiB
0
744 
1
 
17

Length

Max length1
Median length1
Mean length1
Min length1
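
The Length figures describe the string length of each value. A minimal sketch, assuming the column is stored as the one-character strings 0 and 1 with the counts listed above:

```python
from statistics import mean, median

# Dx:Cancer rebuilt from its counts: 744 x "0" and 17 x "1".
values = ["0"] * 744 + ["1"] * 17

# Length of every value as a string.
lengths = [len(v) for v in values]

print("Max length", max(lengths))
print("Median length", median(lengths))
print("Mean length", mean(lengths))
print("Min length", min(lengths))
```

Because every value is a single character, all four statistics equal 1, and the total character count equals the number of rows, 761.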

Characters and Unicode

Total characters761
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row1
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Most occurring characters

ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Dx:CIN
Categorical

IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size43.1 KiB
0
757 
1
 
4

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters761
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 757
99.5%
1 4
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 757
99.5%
1 4
 
0.5%

Most occurring characters

ValueCountFrequency (%)
0 757
99.5%
1 4
 
0.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 757
99.5%
1 4
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 757
99.5%
1 4
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 757
99.5%
1 4
 
0.5%

Dx:HPV
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size43.1 KiB
0
744 
1
 
17

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters761
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row1
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Most occurring characters

ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Most occurring categories

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 744
97.8%
1 17
 
2.2%

Dx
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size43.1 KiB
0
743 
1
 
18

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters761
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 743
97.6%
1 18
 
2.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 743
97.6%
1 18
 
2.4%

Most occurring characters

ValueCountFrequency (%)
0 743
97.6%
1 18
 
2.4%

Most occurring categories

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 743
97.6%
1 18
 
2.4%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 743
97.6%
1 18
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 743
97.6%
1 18
 
2.4%

Hinselmann
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size43.1 KiB
0
731 
1
 
30

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters761
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 731
96.1%
1 30
 
3.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 731
96.1%
1 30
 
3.9%

Most occurring characters

ValueCountFrequency (%)
0 731
96.1%
1 30
 
3.9%

Most occurring categories

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 731
96.1%
1 30
 
3.9%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 731
96.1%
1 30
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 731
96.1%
1 30
 
3.9%

Schiller
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size43.1 KiB
0
697 
1
 
64

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters761
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 697
91.6%
1 64
 
8.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 697
91.6%
1 64
 
8.4%

Most occurring characters

ValueCountFrequency (%)
0 697
91.6%
1 64
 
8.4%

Most occurring categories

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 697
91.6%
1 64
 
8.4%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 697
91.6%
1 64
 
8.4%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 697
91.6%
1 64
 
8.4%

Citology
Categorical

IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size43.1 KiB
0
719 
1
 
42

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters761
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 719
94.5%
1 42
 
5.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 719
94.5%
1 42
 
5.5%

Most occurring characters

ValueCountFrequency (%)
0 719
94.5%
1 42
 
5.5%

Most occurring categories

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 719
94.5%
1 42
 
5.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 719
94.5%
1 42
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 719
94.5%
1 42
 
5.5%

Biopsy
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)0.3%
Missing0
Missing (%)0.0%
Memory size43.1 KiB
0
715 
1
 
46

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters761
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 715
94.0%
1 46
 
6.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 715
94.0%
1 46
 
6.0%

Most occurring characters

ValueCountFrequency (%)
0 715
94.0%
1 46
 
6.0%

Most occurring categories

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0 715
94.0%
1 46
 
6.0%

Most occurring scripts

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0 715
94.0%
1 46
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
(unknown) 761
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0 715
94.0%
1 46
 
6.0%

Interactions

[Figures: 64 interaction plots, one for each ordered pair of the 8 numeric variables]

Correlations

Variable, Age, Biopsy, Citology, Dx, Dx:CIN, Dx:Cancer, Dx:HPV, First sexual intercourse, Hinselmann, Hormonal Contraceptives, Hormonal Contraceptives (years), IUD, IUD (years), Num of pregnancies, Number of sexual partners, STDs, STDs (number), STDs: Number of diagnosis, STDs:HIV, STDs:HPV, STDs:Hepatitis B, STDs:condylomatosis, STDs:genital herpes, STDs:molluscum contagiosum, STDs:pelvic inflammatory disease, STDs:syphilis, STDs:vaginal condylomatosis, STDs:vulvo-perineal condylomatosis, Schiller, Smokes, Smokes (packs/year), Smokes (years)
Age, 1.000, 0.068, 0.000, 0.000, 0.038, 0.079, 0.068, 0.443, 0.000, 0.215, 0.276, 0.262, 0.280, 0.538, 0.199, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.053, 0.000, 0.000, 0.097, 0.041, 0.044, 0.054
Biopsy, 0.068, 1.000, 0.311, 0.118, 0.000, 0.163, 0.163, 0.060, 0.500, 0.029, 0.023, 0.064, 0.072, 0.065, 0.014, 0.092, 0.098, 0.086, 0.052, 0.000, 0.000, 0.097, 0.056, 0.000, 0.000, 0.000, 0.000, 0.100, 0.727, 0.000, 0.022, 0.021
Citology, 0.000, 0.311, 1.000, 0.088, 0.000, 0.093, 0.093, -0.004, 0.139, 0.000, -0.009, 0.000, 0.014, -0.037, 0.026, 0.036, 0.027, 0.043, 0.059, 0.000, 0.000, 0.049, 0.000, 0.000, 0.000, 0.000, 0.000, 0.052, 0.350, 0.000, -0.020, -0.020
Dx, 0.000, 0.118, 0.088, 1.000, 0.406, 0.707, 0.649, 0.069, 0.071, 0.000, -0.008, 0.070, 0.103, 0.032, 0.044, 0.000, 0.000, 0.000, 0.000, 0.067, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.050, 0.037, -0.064, -0.064
Dx:CIN, 0.038, 0.000, 0.000, 0.406, 1.000, 0.000, 0.000, -0.048, 0.000, 0.000, 0.022, 0.000, -0.024, 0.017, 0.048, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, -0.030, -0.030
Dx:Cancer, 0.079, 0.163, 0.093, 0.707, 0.000, 1.000, 0.849, 0.101, 0.124, 0.000, 0.035, 0.076, 0.096, 0.048, 0.030, 0.000, 0.000, 0.000, 0.000, 0.250, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.125, 0.000, -0.011, -0.007
Dx:HPV, 0.068, 0.163, 0.093, 0.649, 0.000, 0.849, 1.000, 0.066, 0.124, 0.000, 0.053, 0.000, 0.037, 0.066, 0.028, 0.000, 0.000, 0.000, 0.000, 0.250, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.125, 0.000, 0.012, 0.016
First sexual intercourse, 0.443, 0.060, -0.004, 0.069, -0.048, 0.101, 0.066, 1.000, 0.000, 0.087, 0.071, 0.000, -0.033, -0.016, -0.137, 0.044, 0.084, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.176, 0.308, 0.000, 0.000, 0.073, -0.130, -0.127
Hinselmann, 0.000, 0.500, 0.139, 0.071, 0.000, 0.124, 0.124, 0.000, 1.000, 0.046, 0.020, 0.045, 0.058, 0.048, -0.056, 0.016, 0.175, 0.169, 0.025, 0.000, 0.000, 0.048, 0.000, 0.000, 0.000, 0.000, 0.000, 0.050, 0.631, 0.000, 0.009, 0.011
Hormonal Contraceptives, 0.215, 0.029, 0.000, 0.000, 0.000, 0.000, 0.000, 0.087, 0.046, 1.000, 0.849, 0.077, 0.038, 0.223, 0.038, 0.000, 0.000, 0.000, 0.017, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.029, 0.000, -0.005, -0.005
Hormonal Contraceptives (years), 0.276, 0.023, -0.009, -0.008, 0.022, 0.035, 0.053, 0.071, 0.020, 0.849, 1.000, 0.159, 0.051, 0.279, 0.064, 0.000, 0.000, 0.000, 0.000, 0.106, 0.000, 0.000, 0.000, 0.000, 0.000, 0.045, 0.000, 0.000, 0.150, 0.066, 0.050, 0.050
IUD, 0.262, 0.064, 0.000, 0.070, 0.000, 0.076, 0.000, 0.000, 0.045, 0.077, 0.159, 1.000, 0.998, 0.242, 0.080, 0.011, 0.081, 0.000, 0.000, 0.000, 0.000, 0.064, 0.000, 0.000, 0.000, 0.000, 0.000, 0.042, 0.091, 0.023, -0.043, -0.041
IUD (years), 0.280, 0.072, 0.014, 0.103, -0.024, 0.096, 0.037, -0.033, 0.058, 0.038, 0.051, 0.998, 1.000, 0.245, 0.082, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.183, 0.000, -0.043, -0.040
Num of pregnancies, 0.538, 0.065, -0.037, 0.032, 0.017, 0.048, 0.066, -0.016, 0.048, 0.223, 0.279, 0.242, 0.245, 1.000, 0.169, 0.060, 0.018, 0.000, 0.000, 0.000, 0.000, 0.056, 0.000, 0.000, 0.239, 0.212, 0.000, 0.042, 0.143, 0.109, 0.057, 0.059
Number of sexual partners, 0.199, 0.014, 0.026, 0.044, 0.048, 0.030, 0.028, -0.137, -0.056, 0.038, 0.064, 0.080, 0.082, 0.169, 1.000, 0.000, 0.038, 0.000, 0.058, 0.000, 0.000, 0.053, 0.000, 0.024, 0.024, 0.000, 0.000, 0.058, 0.000, 0.267, 0.240, 0.236
STDs, 0.000, 0.092, 0.036, 0.000, 0.000, 0.000, 0.000, 0.044, 0.016, 0.000, 0.000, 0.011, 0.000, 0.060, 0.000, 1.000, 0.998, 0.943, 0.425, 0.110, 0.035, 0.714, 0.035, 0.035, 0.035, 0.455, 0.192, 0.704, 0.082, 0.098, 0.113, 0.112
STDs (number), 0.000, 0.098, 0.027, 0.000, 0.000, 0.000, 0.000, 0.084, 0.175, 0.000, 0.000, 0.081, 0.000, 0.018, 0.038, 0.998, 1.000, 0.824, 0.597, 0.238, 0.154, 0.985, 0.160, 0.160, 0.160, 0.692, 0.612, 0.972, 0.141, 0.111, 0.113, 0.112
STDs: Number of diagnosis, 0.000, 0.086, 0.043, 0.000, 0.000, 0.000, 0.000, 0.000, 0.169, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.943, 0.824, 1.000, 0.546, 0.047, 0.104, 0.708, 0.104, 0.104, 0.104, 0.445, 0.236, 0.697, 0.133, 0.090, 0.107, 0.105
STDs:HIV, 0.000, 0.052, 0.059, 0.000, 0.000, 0.000, 0.000, 0.000, 0.025, 0.017, 0.000, 0.000, 0.000, 0.000, 0.058, 0.425, 0.597, 0.546, 1.000, 0.000, 0.120, 0.065, 0.000, 0.000, 0.000, 0.000, 0.000, 0.067, 0.067, 0.000, 0.057, 0.054
STDs:HPV, 0.000, 0.000, 0.000, 0.067, 0.000, 0.250, 0.250, 0.000, 0.000, 0.000, 0.106, 0.000, 0.000, 0.000, 0.000, 0.110, 0.238, 0.047, 0.000, 1.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.043, 0.056
STDs:Hepatitis B, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.035, 0.154, 0.104, 0.120, 0.000, 1.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.003, 0.099, 0.095
STDs:condylomatosis, 0.000, 0.097, 0.049, 0.000, 0.000, 0.000, 0.000, 0.000, 0.048, 0.000, 0.000, 0.064, 0.000, 0.056, 0.053, 0.714, 0.985, 0.708, 0.065, 0.000, 0.000, 1.000, 0.000, 0.000, 0.000, 0.000, 0.269, 0.973, 0.106, 0.031, 0.058, 0.058
STDs:genital herpes, 0.000, 0.056, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.035, 0.160, 0.104, 0.000, 0.000, 0.000, 0.000, 1.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, -0.015, -0.015
STDs:molluscum contagiosum, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.024, 0.035, 0.160, 0.104, 0.000, 0.000, 0.000, 0.000, 0.000, 1.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, -0.015, -0.015
STDs:pelvic inflammatory disease, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.239, 0.024, 0.035, 0.160, 0.104, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 1.000, 0.000, 0.000, 0.000, 0.000, 0.000, -0.015, -0.015
STDs:syphilis, 0.053, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.176, 0.000, 0.000, 0.045, 0.000, 0.000, 0.212, 0.000, 0.455, 0.692, 0.445, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 1.000, 0.000, 0.000, 0.000, 0.067, 0.086, 0.082
STDs:vaginal condylomatosis, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.308, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.192, 0.612, 0.236, 0.000, 0.000, 0.000, 0.269, 0.000, 0.000, 0.000, 0.000, 1.000, 0.189, 0.000, 0.030, 0.080, 0.084
STDs:vulvo-perineal condylomatosis, 0.000, 0.100, 0.052, 0.000, 0.000, 0.000, 0.000, 0.000, 0.050, 0.000, 0.000, 0.042, 0.000, 0.042, 0.058, 0.704, 0.972, 0.697, 0.067, 0.000, 0.000, 0.973, 0.000, 0.000, 0.000, 0.000, 0.189, 1.000, 0.110, 0.035, 0.061, 0.061
Schiller, 0.097, 0.727, 0.350, 0.050, 0.000, 0.125, 0.125, 0.000, 0.631, 0.029, 0.150, 0.091, 0.183, 0.143, 0.000, 0.082, 0.141, 0.133, 0.067, 0.000, 0.000, 0.106, 0.000, 0.000, 0.000, 0.000, 0.000, 0.110, 1.000, 0.000, 0.022, 0.027
Smokes, 0.041, 0.000, 0.000, 0.037, 0.000, 0.000, 0.000, 0.073, 0.000, 0.000, 0.066, 0.023, 0.000, 0.109, 0.267, 0.098, 0.111, 0.090, 0.000, 0.000, 0.003, 0.031, 0.000, 0.000, 0.000, 0.067, 0.030, 0.035, 0.000, 1.000, 0.996, 0.996
Smokes (packs/year), 0.044, 0.022, -0.020, -0.064, -0.030, -0.011, 0.012, -0.130, 0.009, -0.005, 0.050, -0.043, -0.043, 0.057, 0.240, 0.113, 0.113, 0.107, 0.057, 0.043, 0.099, 0.058, -0.015, -0.015, -0.015, 0.086, 0.080, 0.061, 0.022, 0.996, 1.000, 0.997
Smokes (years), 0.054, 0.021, -0.020, -0.064, -0.030, -0.007, 0.016, -0.127, 0.011, -0.005, 0.050, -0.041, -0.040, 0.059, 0.236, 0.112, 0.112, 0.105, 0.054, 0.056, 0.095, 0.058, -0.015, -0.015, -0.015, 0.082, 0.084, 0.061, 0.027, 0.996, 0.997, 1.000
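
The matrix does not state which coefficient each cell holds. For a pair of categorical columns, Cramér's V is one common association measure that ranges from 0 to 1. The sketch below assumes that measure purely for illustration and implements the plain, uncorrected form. The function name cramers_v and the example tables are hypothetical and are not taken from this dataset.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    # Normalise by n * (min(rows, columns) - 1) so the result lies in [0, 1].
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    return math.sqrt(chi2 / (n * k))

# Hypothetical 2x2 tables, not taken from the report.
print(cramers_v([[50, 0], [0, 50]]))    # the two columns always agree
print(cramers_v([[25, 25], [25, 25]]))  # the two columns are independent
```

With counts as skewed as the ones in this report (for example 760 against 1), such a coefficient rests on very few positive cases, so individual high values deserve a look at the underlying contingency table.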

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
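
The per-column nullity counts behind such a chart can be sketched in a few lines. The rows below are hypothetical stand-ins, not records from this dataset. The final line reuses the totals from the report's overview (173 missing cells, 761 observations, 32 variables) to reproduce the overall missing share.

```python
import math

# Hypothetical rows; None and float("nan") both stand for a missing cell.
rows = [
    {"Age": 18, "IUD (years)": 0.0},
    {"Age": 15, "IUD (years)": float("nan")},
    {"Age": 34, "IUD (years)": None},
]

def is_missing(value):
    """True for None and for floating-point NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

# Nullity by column: number of missing cells in each column.
missing_by_column = {
    column: sum(is_missing(row[column]) for row in rows)
    for column in rows[0]
}
print(missing_by_column)

# Overall share of missing cells from the overview totals.
print(f"{173 / (761 * 32):.1%}")
```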

Sample

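Index, Age, Number of sexual partners, First sexual intercourse, Num of pregnancies, Smokes, Smokes (years), Smokes (packs/year), Hormonal Contraceptives, Hormonal Contraceptives (years), IUD, IUD (years), STDs, STDs (number), STDs:condylomatosis, STDs:vaginal condylomatosis, STDs:vulvo-perineal condylomatosis, STDs:syphilis, STDs:pelvic inflammatory disease, STDs:genital herpes, STDs:molluscum contagiosum, STDs:HIV, STDs:Hepatitis B, STDs:HPV, STDs: Number of diagnosis, Dx:Cancer, Dx:CIN, Dx:HPV, Dx, Hinselmann, Schiller, Citology, Biopsy
0, 18, 4.0, 15.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
1, 15, 1.0, 14.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
3, 52, 5.0, 16.0, 4.0, 1.0, 37.0, 37.0, 1.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1, 0, 1, 0, 0, 0, 0, 0
4, 46, 3.0, 21.0, 4.0, 0.0, 0.0, 0.0, 1.0, 15.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
5, 42, 3.0, 23.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
6, 51, 3.0, 17.0, 6.0, 1.0, 34.0, 3.4, 0.0, 0.0, 1.0, 7.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 1, 1, 0, 1
7, 26, 1.0, 26.0, 3.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 7.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
8, 45, 1.0, 20.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1, 0, 1, 1, 0, 0, 0, 0
10, 44, 3.0, 26.0, 4.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
11, 27, 1.0, 17.0, 3.0, 0.0, 0.0, 0.0, 1.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0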
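Index, Age, Number of sexual partners, First sexual intercourse, Num of pregnancies, Smokes, Smokes (years), Smokes (packs/year), Hormonal Contraceptives, Hormonal Contraceptives (years), IUD, IUD (years), STDs, STDs (number), STDs:condylomatosis, STDs:vaginal condylomatosis, STDs:vulvo-perineal condylomatosis, STDs:syphilis, STDs:pelvic inflammatory disease, STDs:genital herpes, STDs:molluscum contagiosum, STDs:HIV, STDs:Hepatitis B, STDs:HPV, STDs: Number of diagnosis, Dx:Cancer, Dx:CIN, Dx:HPV, Dx, Hinselmann, Schiller, Citology, Biopsy
848, 31, 3.0, 18.0, 1.0, 0.0, 0.0, 0.00, 1.0, 0.50, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
849, 32, 3.0, 18.0, 1.0, 1.0, 11.0, 0.16, 1.0, 6.00, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1, 0, 1, 0, 0, 0, 0, 0
850, 19, 1.0, 14.0, 0.0, 0.0, 0.0, 0.00, 0.0, 0.00, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
851, 23, 2.0, 15.0, 2.0, 0.0, 0.0, 0.00, 0.0, 0.00, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
852, 43, 3.0, 17.0, 3.0, 0.0, 0.0, 0.00, 1.0, 5.00, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
853, 34, 3.0, 18.0, 0.0, 0.0, 0.0, 0.00, 0.0, 0.00, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
854, 32, 2.0, 19.0, 1.0, 0.0, 0.0, 0.00, 1.0, 8.00, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
855, 25, 2.0, 17.0, 0.0, 0.0, 0.0, 0.00, 1.0, 0.08, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 1, 0
856, 33, 2.0, 24.0, 2.0, 0.0, 0.0, 0.00, 1.0, 0.08, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0
857, 29, 2.0, 20.0, 1.0, 0.0, 0.0, 0.00, 1.0, 0.50, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0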

Duplicate rows

Most frequently occurring

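| | Age | Number of sexual partners | First sexual intercourse | Num of pregnancies | Smokes | Smokes (years) | Smokes (packs/year) | Hormonal Contraceptives | Hormonal Contraceptives (years) | IUD | IUD (years) | STDs | STDs (number) | STDs:condylomatosis | STDs:vaginal condylomatosis | STDs:vulvo-perineal condylomatosis | STDs:syphilis | STDs:pelvic inflammatory disease | STDs:genital herpes | STDs:molluscum contagiosum | STDs:HIV | STDs:Hepatitis B | STDs:HPV | STDs: Number of diagnosis | Dx:Cancer | Dx:CIN | Dx:HPV | Dx | Hinselmann | Schiller | Citology | Biopsy | # duplicates |
| 0 | 15 | 1.0 | 14.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 |
| 7 | 17 | 2.0 | 15.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 |
| 1 | 15 | 1.0 | 15.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 0.0 | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 2 | 15 | 2.0 | 14.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 3 | 16 | 1.0 | 14.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 4 | 16 | 1.0 | 15.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 5 | 17 | 1.0 | 16.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | NaN | 0.0 | NaN | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 6 | 17 | 1.0 | 17.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 8 | 17 | 2.0 | 15.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.33 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 9 | 18 | 1.0 | 14.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |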
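
A table like "Most frequently occurring" can be approximated in pandas by grouping on every column and counting group sizes. This is a sketch under stated assumptions rather than the report generator's actual implementation: the helper name `most_frequent_duplicates` and the toy values are hypothetical, and `dropna=False` (pandas 1.1 or later) is assumed so that rows containing NaN are still grouped, as the NaN rows in the table above suggest.

```python
import numpy as np
import pandas as pd


def most_frequent_duplicates(df: pd.DataFrame, top: int = 10) -> pd.DataFrame:
    """List distinct rows that occur more than once, with their occurrence counts."""
    counts = (
        df.groupby(list(df.columns), dropna=False)  # keep groups whose keys contain NaN
        .size()
        .reset_index(name="# duplicates")
    )
    repeated = counts[counts["# duplicates"] > 1]
    return (
        repeated.sort_values("# duplicates", ascending=False)
        .head(top)
        .reset_index(drop=True)
    )


# Hypothetical toy frame; the values are illustrative, not taken from the report.
toy = pd.DataFrame(
    {
        "Age": [15, 15, 17, 17, 17, 18],
        "Number of sexual partners": [1.0, 1.0, 2.0, 2.0, 2.0, 1.0],
        "IUD (years)": [np.nan, np.nan, 0.0, 0.0, 0.0, 0.0],
    }
)

dups = most_frequent_duplicates(toy)

# Rows that repeat an earlier row (every occurrence after the first).
n_duplicate_rows = int(toy.duplicated().sum())
print(dups)
print(n_duplicate_rows)
```

Note that the two numbers answer different questions: `# duplicates` counts every occurrence of a repeated row, while `DataFrame.duplicated()` flags only the occurrences after the first.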